Abstract

We propose a method to associate a differentiable Riemannian manifold to
a generic many degrees of freedom discrete system which is not described by
a Hamiltonian function. Then, in analogy with classical Statistical Mechanics,
we introduce an entropy as the logarithm of the volume of the manifold. The
geometric entropy so defined is able to detect a paradigmatic phase transition
occurring in random graphs theory: the appearance of the ‘giant component’
according to the Erdős-Rényi theorem.
1 Introduction

Thermodynamic phase transitions are examples of emergent phenomena in many 
degrees of freedom systems, described in the framework of Statistical Mechanics. 
The standard statistical ensemble measures relate macroscopic (thermodynamic)
observables to microscopic degrees of freedom. The interactions among the
microscopic degrees of freedom, which can be either continuous or discrete (as in the
case of spin models, vertex models, and so on), are often described by a
Hamiltonian function (or a Hamiltonian operator in a quantum context) [1]. But what
about discrete systems, such as networks, undergoing a phase transition for which a
microscopic Hamiltonian does not exist?
A paradigmatic example is the random graph model G(n, k),
devised by choosing with uniform probability a graph from the set of all graphs having
n vertices and k edges [2]. We can think of a process evolving by adding the edges
one at a time. When k has the same order of magnitude as n, the evolution from
k = 0 to $k = \binom{n}{2}$ yields, according to the Erdős-Rényi theorem [3], a phase transition,
revealing itself in a rapid growth with k of the size of the largest component (the number
of vertices connected to one another by edges). Specifically, the structure of graphs when the
expected degree of each vertex is close to 1, i.e. k ~ n/2, shows a jump:
the order of magnitude of the size of the largest component rapidly grows,
asymptotically almost surely (a.a.s.), from log n to n, if k has the same order of magnitude
as n. In fact, if k < n/2, as the process evolves, the components of a graph [the
largest of them being a.a.s. of size O(log n)] merge mainly by attaching small trees;
thus they grow slowly and quite smoothly. Nonetheless, at some point of the
process, the largest components become so large that it is likely for a new edge to
connect two of them. Thus, fairly quickly, all the largest components of a graph
merge into one giant component, much larger than any of the remaining ones [2]. It
is worth noticing that this process represents the mean-field case of percolation [4].
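The growth of the largest component described above can be reproduced with a few lines of code. The following is only a minimal sketch, assuming that edges are added one at a time, uniformly at random and without repetition; the function name `largest_component_sizes`, the union-find bookkeeping and the choice n = 2000 are illustrative and are not taken from the paper.

```python
import random

def largest_component_sizes(n, k_values, seed=0):
    """Add random edges one at a time to n isolated vertices and record
    the size of the largest connected component at each requested k."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    wanted = set(k_values)
    edges = set()
    largest = 1
    result = {}
    if 0 in wanted:
        result[0] = 1
    k_max = max(wanted)
    while len(edges) < k_max:
        i, j = rng.sample(range(n), 2)
        edge = (min(i, j), max(i, j))
        if edge in edges:
            continue  # G(n, k) contains no repeated edges
        edges.add(edge)
        ri, rj = find(i), find(j)
        if ri != rj:
            # Merge the smaller component into the larger one.
            if size[ri] < size[rj]:
                ri, rj = rj, ri
            parent[rj] = ri
            size[ri] += size[rj]
            largest = max(largest, size[ri])
        if len(edges) in wanted:
            result[len(edges)] = largest
    return result

n = 2000
sizes = largest_component_sizes(n, [n // 4, n // 2, n])
for k, s in sorted(sizes.items()):
    print(f"k/n = {k / n:.2f}  largest component = {s}")
```

For k well below n/2 the largest component should stay small compared with n, while for k of order n it should contain a finite fraction of all vertices.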

Regarding G(n, k) as a statistical ensemble it is quite natural to employ tools 
from statistical mechanics, above all entropy, to analyze it. In Ref. [5] the Gibbs 
entropy of such random graphs was defined as 


$$S := \ln\left[\frac{1}{n!}\binom{\binom{n}{2}}{k}\right]. \tag{1}$$


There, the configuration space was given by the $\binom{\binom{n}{2}}{k}$ graphs with labelled nodes. Due
to their equiprobability, they have the same weight, chosen to be 1/n! in order to
account for all the labelling permutations of the nodes. Later, the perspective was to
modify the Erdős-Rényi ensemble by introducing a functional weight which explicitly
depends on the graph’s topology. In this way one can eventually characterize other
classes of random graphs as well, such as scale-free graphs or graphs with a fixed degree sequence. This research
line has been pursued in the last decade [6], also putting forward variants of
the entropy measure [6, 7]. However, we may notice that the entropy (1), as a
function of the ratio k/n, is by itself unable to detect the phase transition occurring in
the Erdős-Rényi ensemble.
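The smooth behaviour of the Gibbs entropy across k = n/2 can be checked numerically. The sketch below assumes that Eq. (1) has the form $S = \ln[(1/n!)\binom{\binom{n}{2}}{k}]$; the helper names `log_binomial` and `gibbs_entropy` and the value n = 200 are illustrative choices, not part of the original work.

```python
from math import lgamma

def log_binomial(a, b):
    """Natural logarithm of the binomial coefficient C(a, b), via log-gamma."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def gibbs_entropy(n, k):
    """S = ln[(1/n!) * C(C(n, 2), k)] for graphs with n labelled nodes and k edges."""
    pairs = n * (n - 1) // 2
    return log_binomial(pairs, k) - lgamma(n + 1)

n = 200
for k in (50, 90, 100, 110, 150):
    print(f"k/n = {k / n:.2f}  S/n = {gibbs_entropy(n, k) / n:.4f}")
```

Since the logarithm of a binomial coefficient is a concave function of k, the values printed around k/n = 0.5 are expected to change gradually, with no feature marking the transition.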

In the present work, focusing on the Erdős-Rényi ensemble, we propose a general
method to associate a continuous mathematical object (a Riemannian manifold) to a
generic discrete system, graph or network, thus allowing the definition of a geometric
entropy which is able to detect a phase transition. Specifically, we endow each network
with a statistical Riemannian manifold. This is obtained in two steps:
first, by understanding a network as an undirected graph without loops on the nodes
and accounting for links (weighted edges) between nodes through the adjacency
matrix A; second, by considering random variables as sitting on the vertices of the
network, so that methods of information geometry [8] can be employed to lift the network
to a statistical Riemannian manifold. In this way, we associate a configuration space
to each network. Such a space consists of a subset of the linear vector space $\mathbb{R}^m$ given
by the parameters which characterize the joint probability distribution of the random
variables sitting on the nodes of the network. Furthermore, this configuration space
is endowed with a Riemannian metric which is inspired by the well-known Fisher-Rao
metric [8]. Then, in analogy with classical Statistical Mechanics, we define
a geometric entropy as the logarithm of the volume of this manifold. Applied to
Erdős-Rényi random graphs it proves very effective (as a function of the ratio k/n)
in detecting the appearance of the so-called ‘giant component’, as well as any smooth
function of order parameters within the framework of Statistical Mechanics [1].


2 Information geometric model 


Let us start by considering a set of n real-valued random variables $X_1, \ldots, X_n$
characterized by the following multivariate Gaussian probability distribution

$$p(x;\theta) = \frac{1}{\sqrt{(2\pi)^n \det C}} \exp\left[-\frac{1}{2}\, x^t C^{-1} x\right], \tag{2}$$

where $x^t = (x_1, \ldots, x_n) \in \mathbb{R}^n$, with t denoting the transposition, and we have also
assumed, without loss of generality, that all the mean values are zero. Furthermore,
$\theta^t = (\theta^1, \ldots, \theta^m)$ are the real-valued parameters characterizing the above probability
distribution function, i.e. the entries of the covariance matrix C. Hence $m = \frac{n(n+1)}{2}$.

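Next we consider a family $\mathcal{P}$ of such probability distributions

$$\mathcal{P} = \{p_\theta = p(x;\theta) \mid \theta^t = (\theta^1, \ldots, \theta^m) \in \Theta\},$$

where $\Theta \subseteq \mathbb{R}^m$ and the mapping $\theta \to p_\theta$ is injective. Defined in such a way, $\mathcal{P}$ is an
m-dimensional statistical model on $\mathbb{R}^n$. The open set $\Theta$ is defined as follows

$$\Theta = \{\theta \in \mathbb{R}^m \mid C(\theta) > 0\}, \tag{3}$$

and we call it the parameter space of the m-dimensional statistical model $\mathcal{P}$.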

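Assuming parametrizations which are $C^\infty$, we can turn $\mathcal{P}$ into a $C^\infty$ differentiable
manifold [8]. Then, given a point $\theta$, the Fisher information matrix of $\mathcal{P}$ in $\theta$ is the
$m \times m$ matrix $G(\theta) = [g_{\mu\nu}]$, where the $\mu\nu$ entry is defined by

$$g_{\mu\nu}(\theta) := \int_{\mathbb{R}^n} dx\, p(x;\theta)\, \partial_\mu \log p(x;\theta)\, \partial_\nu \log p(x;\theta), \tag{4}$$

with $\partial_\mu$ standing for $\partial/\partial\theta^\mu$. The matrix $G(\theta)$ is symmetric, positive semidefinite and
endows the parameter space $\Theta$ with a Riemannian metric [9].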

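We highlight that the integral of Eq. (4) with (2) is a Gaussian one and amounts
to

$$\exp\left[\frac{1}{2} \sum_{i,j=1}^{n} c_{ij}\, \frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j}\right] f_{\mu\nu}\Big|_{x=0}, \tag{5}$$

where $c_{ij}$ denotes the ij entry of C, the exponential stands for a power series expansion over its argument (the
differential operator) and

$$f_{\mu\nu} := \partial_\mu \log[p(x;\theta)]\, \partial_\nu \log[p(x;\theta)]. \tag{6}$$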
The derivative of the logarithm has the following expression

$$\partial_\mu \log[p(x;\theta)] = -\frac{1}{2}\left[\frac{\partial_\mu(\det C)}{\det C} + \sum_{\alpha,\beta=1}^{n} \partial_\mu\!\left(c^{-1}_{\alpha\beta}\right) x_\alpha x_\beta\right], \tag{7}$$

where $c^{-1}_{\alpha\beta}$ denotes the $\alpha\beta$ entry of the inverse of the covariance matrix C in (2).
The latter equation together with Eq. (5) shows the computational complexity of
Eq. (4). Indeed, the well-known formulas

$$\partial_\mu C^{-1}(\theta) = -C^{-1}(\theta)\,\big(\partial_\mu C(\theta)\big)\, C^{-1}(\theta),$$

$$\partial_\mu\big(\det C(\theta)\big) = \det C(\theta)\, \mathrm{Tr}\big(C^{-1}(\theta)\, \partial_\mu C(\theta)\big)$$

require the calculation of n(n + 1) derivatives with respect to the variables $\theta \in \Theta$
of Eq. (3) in order to work out the derivative of the logarithm in (7). Finally, to
obtain the function in (6), we have to evaluate $O(n^4)$ derivatives. This quickly
becomes an unfeasible task with growing n, even numerically.
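The two matrix-derivative identities quoted above, in their standard form $\partial_\mu C^{-1} = -C^{-1}(\partial_\mu C)C^{-1}$ and $\partial_\mu \det C = \det C\,\mathrm{Tr}(C^{-1}\partial_\mu C)$, can be checked by finite differences. The sketch below does so for a hypothetical 2 x 2 covariance matrix with parameters (theta1, theta2, theta3); the parametrization, the numerical values and the helper names are illustrative assumptions.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cov(t1, t2, t3):
    """Illustrative covariance matrix C(theta) = [[t1, t3], [t3, t2]]."""
    return [[t1, t3], [t3, t2]]

theta = (2.0, 3.0, 0.5)
h = 1e-6
C = cov(*theta)
dC = [[1.0, 0.0], [0.0, 0.0]]  # derivative of C with respect to theta1

# Finite-difference derivatives with respect to theta1.
C_plus = cov(theta[0] + h, theta[1], theta[2])
fd_det = (det2(C_plus) - det2(C)) / h
fd_inv = [[(inv2(C_plus)[i][j] - inv2(C)[i][j]) / h for j in range(2)]
          for i in range(2)]

# Closed-form expressions from the two identities.
Cinv = inv2(C)
prod = matmul2(Cinv, dC)
formula_det = det2(C) * (prod[0][0] + prod[1][1])
formula_inv = [[-v for v in row] for row in matmul2(prod, Cinv)]

print("d(det C):", fd_det, formula_det)
print("d(C^-1) :", fd_inv, formula_inv)
```

Each of the m = n(n + 1)/2 parameters requires such an evaluation, which gives a feeling for how quickly the cost of the exact Fisher-Rao metric grows with n.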

In order to overcome the difficulty of explicitly computing the components of
the Fisher-Rao metric, we proceed by defining a new (pseudo-)Riemannian metric
on the parameter space $\Theta$ which also accounts for the network structure given by
the adjacency matrix A.

To this end we consider first a trivial network with null adjacency matrix and
interpret n independent Gaussian random variables $X_i$ as sitting on its vertices. In
this particular case, the joint probability distribution (2) is given with a diagonal
covariance matrix with entries given by $\theta^i := E(X_i^2)$. Let us denote this matrix as
$C_0(\theta)$. So, making use of Eqs. (3) and (4), the statistical Riemannian manifold
$\mathcal{M} = (\Theta, g)$, with [10]

$$\Theta = \{\theta = (\theta^1, \ldots, \theta^n) \mid \theta^i > 0\}, \qquad g = \frac{1}{2} \sum_{i=1}^{n} \left(\frac{1}{\theta^i}\right)^2 d\theta^i \otimes d\theta^i, \tag{8}$$

is associated to the bare network.
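For a single zero-mean Gaussian variable with variance theta, the Fisher information with respect to theta equals $1/(2\theta^2)$, which is the diagonal entry of the bare-network metric. The following sketch checks this value by Monte Carlo sampling of the squared score; the function name `fisher_information_mc`, the sample size and the seed are illustrative assumptions.

```python
import random

def fisher_information_mc(theta, samples=200_000, seed=0):
    """Monte Carlo estimate of E[(d/dtheta log p)^2] for a N(0, theta) variable."""
    rng = random.Random(seed)
    sigma = theta ** 0.5
    total = 0.0
    for _ in range(samples):
        x = rng.gauss(0.0, sigma)
        # Score: derivative of log p(x; theta) with respect to the variance theta.
        score = -0.5 / theta + 0.5 * x * x / theta ** 2
        total += score * score
    return total / samples

theta = 2.0
print("Monte Carlo estimate:", fisher_information_mc(theta))
print("1 / (2 theta^2)     :", 1.0 / (2.0 * theta ** 2))
```

Because the n variables on the bare network are independent, the full metric is simply the diagonal collection of these one-dimensional values.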

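Given the matrix $C_0(\theta) = \mathrm{diag}[\theta^1, \ldots, \theta^n]$, from (8) it is evident that $g_{ii} = \frac{1}{2}\left(c^{-1}_{ii}\right)^2$,
where $c^{-1}_{ii}$ is the ii entry of the inverse matrix of $C_0(\theta)$, given by $c^{-1}_{ii} = 1/\theta^i$
for all $i \in \{1, \ldots, n\}$. Inspired by this functional form of g, we propose to associate
a (pseudo-)Riemannian manifold to any network $\mathcal{X}$ with non-vanishing adjacency
matrix A. To this aim, we consider the map $\psi_{C_0(\theta)} : \mathrm{A}(n,\mathbb{R}) \to GL(n,\mathbb{R})$ defined by

$$\psi_{C_0(\theta)}(A) := C_0(\theta) + A, \tag{9}$$

with $\mathrm{A}(n,\mathbb{R})$ denoting the set of the symmetric $n \times n$ matrices over $\mathbb{R}$ with vanishing
diagonal elements, which can represent any simple undirected graph. Then, we deform
the manifold $\mathcal{M}$ in (8) via $\psi_{C_0(\theta)}$. Hence the manifold associated to a network $\mathcal{X}$ with
adjacency matrix A is $\widetilde{\mathcal{M}} = (\widetilde{\Theta}, \widetilde{g})$ with

$$\widetilde{\Theta} := \{\theta \in \Theta \mid \psi_{C_0(\theta)}(A) \text{ is non-degenerate}\} \tag{10}$$

and $\widetilde{g} = \sum_{\mu,\nu} \widetilde{g}_{\mu\nu}\, d\theta^\mu \otimes d\theta^\nu$ with components

$$\widetilde{g}_{\mu\nu} = \frac{1}{2}\left(\psi_{C_0(\theta)}(A)^{-1}_{\mu\nu}\right)^2, \tag{11}$$

where $\psi_{C_0(\theta)}(A)^{-1}_{\mu\nu}$ is the $\mu\nu$ entry of the inverse of the matrix $\psi_{C_0(\theta)}(A)$.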
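The construction of Eqs. (9) and (11) can be made concrete as follows. This is a minimal sketch in pure Python, assuming that the metric components are one half of the squared entries of the inverse of $C_0(\theta) + A$; the helper names `invert` and `deformed_metric`, the pivot tolerance and the sample values are illustrative.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is (numerically) degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def deformed_metric(theta, adjacency):
    """Components (1/2) * (psi^{-1}_{mu nu})^2 with psi = diag(theta) + A."""
    n = len(theta)
    psi = [[adjacency[i][j] + (theta[i] if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    inv = invert(psi)
    return [[0.5 * inv[i][j] * inv[i][j] for j in range(n)] for i in range(n)]

# Two nodes joined by one link of weight 0.2, unit variances.
for row in deformed_metric([1.0, 1.0], [[0.0, 0.2], [0.2, 0.0]]):
    print(row)
```

With a null adjacency matrix the function returns the diagonal bare-network metric, while any link switches on off-diagonal components. Points where the inversion fails are exactly those excluded by the non-degeneracy condition (10).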

In this way, we associate a differentiable system (Riemannian manifold) to a
discrete system (network) through the description of the network by a set of probability
distribution functions. Other ways to describe a network with probabilistic
methods are also employed in the literature. Among them it is worth mentioning the
random walk method [11]. There, the Green function, meaning the transition amplitude
from one vertex to another obtained by accounting for all possible paths, gives rise to
a metric [12], thus also allowing for a geometric approach. The main
difference is that in such a case one deals with a stationary transition probability
originating from the adjacency matrix [11], whereas in our case, besides the adjacency matrix, also
the variances of a Gaussian distribution of random variables sitting on the nodes of the
network play a role (namely, a family of Gaussian distributions).

3 A geometric entropy 

We now define a geometric entropy of a network $\mathcal{X}$ with adjacency matrix A and
associated manifold $\widetilde{\mathcal{M}} = (\widetilde{\Theta}, \widetilde{g})$ as

$$S := \ln V(A), \tag{12}$$

where V(A) is the volume of $\widetilde{\mathcal{M}}$ evaluated from the element

$$\nu_{\widetilde{g}} = \sqrt{|\det \widetilde{g}(\theta)|}\; d\theta^1 \wedge \ldots \wedge d\theta^n. \tag{13}$$

Notice, however, that in such a way V(A) is ill-defined. Thus we regularize it as
follows

$$V(A) := \int_{\widetilde{\Theta}} \Upsilon\big(\psi_{C_0(\theta)}(A)\big)\, \nu_{\widetilde{g}}, \tag{14}$$

where $\Upsilon(\psi_{C_0(\theta)}(A))$ is any suitable "infrared" and "ultraviolet" regularizing function;
$\psi_{C_0(\theta)}(A)$ and $\nu_{\widetilde{g}}$ are given in (9) and (13), respectively. The need for
regularization is twofold: the set $\widetilde{\Theta}$ in Eq. (10) is not compact because the variables
$\theta^i$ are unbounded from above; furthermore, from Eq. (11), $\det \widetilde{g}(\theta)$ diverges since
$\det \psi_{C_0(\theta)}(A)$ approaches zero for some $\theta^i$. A possible choice of $\Upsilon$ has recently been
defined [10],

$$\Upsilon(C(\theta)) := e^{-\mathrm{Tr}\,C(\theta)} \log\left[1 + (\det C(\theta))^n\right], \tag{15}$$

where C is the covariance matrix in (2) when off-diagonal entries are 1 or 0. In the
present work we extend such a regularizing function to a more general kind of
networks, taking also into account the weights of links between vertices. However,
the functional form should still be as in (15).
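To see how a regularizing function of the type (15) acts in both limits, the following sketch evaluates $e^{-\mathrm{Tr}\,C}\log[1 + (\det C)^n]$ for a few matrices. It assumes a matrix with positive determinant; the helper names `determinant` and `regularizer` and the sample matrices are illustrative.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    rows = [list(map(float, row)) for row in matrix]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if rows[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, n):
            f = rows[r][col] / rows[col][col]
            rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return det

def regularizer(matrix):
    """exp(-Tr C) * log(1 + (det C)^n); assumes det C > 0."""
    n = len(matrix)
    trace = sum(matrix[i][i] for i in range(n))
    return math.exp(-trace) * math.log1p(determinant(matrix) ** n)

for scale in (1e-6, 1.0, 50.0):
    c = [[scale, 0.0], [0.0, scale]]
    print(f"diagonal entries = {scale:g}  regularizer = {regularizer(c):.3e}")
```

The logarithmic factor suppresses the region where the determinant approaches zero, and the exponential factor suppresses the region of large variances, which are the two sources of divergence named above.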


The definition (12) is inspired by the microcanonical definition of entropy S
in Statistical Mechanics, that is $S := k_B \ln \Omega(E)$, where $\Omega(E)$ is the phase space
volume bounded by the hypersurface of constant energy E. After integration on the
momenta one finds $S = k_B \ln\left(\omega \int_{M_E} [E - V(q_1, \ldots, q_n)]^{n/2}\, dq_1 \wedge \ldots \wedge dq_n\right)$, where $\omega$ is
a constant stemming from the integration on the momenta, $M_E$ is the configuration
space subset bounded by the equipotential level set $E = V(q_1, \ldots, q_n)$, and $q_1, \ldots, q_n$
are the configurational coordinates. Now, the term $[E - V(q_1, \ldots, q_n)]^{n/2}$ is just
$\sqrt{\det g_J}$, with $g_J$ the Jacobi kinetic energy metric whose associated geodesic flow
coincides with the underlying Hamiltonian flow [1]. In the end the microcanonical
entropy is $S = k_B \ln \int_{M_E} \sqrt{\det g_J}\; dq_1 \wedge \ldots \wedge dq_n + k_B \ln \omega$, that is, proportional to the
logarithm of the volume of the Riemannian manifold associated with the underlying
dynamics.

As already stated at the beginning of this paper, in order to assess the interest of
the proposed geometric entropy in Eq. (12), we check it against a system undergoing
the classical Erdős-Rényi phase transition in random graphs [2, 3].


4 Numerical results 


We numerically compute S(k), the geometric entropy in Eq. (12), vs k for a fixed n in
order to investigate its sensitivity to the appearance of the giant component during
the evolution of the random graph model G(n, k).

To this aim, we consider four different numbers of vertices: n = 25, 50, 100, 200.
The choice of n determines the dimension of the associated manifold $\widetilde{\mathcal{M}}$. Then, for
a given n, we choose the number of links k, with k = 0, 1, ..., n(n − 1)/2. Next, for
a given pair (n, k) we generate at random a set of k entries (i, j), with i < j, of the
non-vanishing adjacency matrix elements $A_{ij}$.

Hence, since the covariance matrix C is functionally assigned, we get $\psi_C(A)$ of
Eq. (9) and finally the metric $\widetilde{g}$ of Eq. (11). Now, having determined $\widetilde{\mathcal{M}} = (\widetilde{\Theta}, \widetilde{g})$, we
can compute the volume V(A) in Eq. (14) and the entropy S of Eq. (12). In numerical
computations the volume regularization is performed in two steps. The first one consists in
restricting the manifold support $\widetilde{\Theta} \subset \mathbb{R}^n$ to a hypercube. Inside $\widetilde{\Theta}$ we generate
a Markov chain to perform a Monte Carlo estimate of the average

$$\left\langle \sqrt{\det \widetilde{g}} \right\rangle = \frac{\int \sqrt{\det \widetilde{g}}\; d\theta^1 \wedge \ldots \wedge d\theta^n}{\int d\theta^1 \wedge \ldots \wedge d\theta^n}.$$

The number of random configurations considered varies between $10^1$ and $10^6$. The
second step of the regularization procedure consists in excluding
those points where the value of $\sqrt{\det \widetilde{g}}$ exceeds $10^{308}$ (the numerical overflow limit
of the computers used). For any given pair (n, k) this computational scheme is
repeated $10^3$ times, each time considering a different randomly generated realisation
of the adjacency matrix A. Thus the final values of the entropy S are obtained as
averages over these $10^3$ different manifolds $\widetilde{\mathcal{M}}$. In Figures 1 and 2 we report the
Monte Carlo numerical estimates of

$$\widetilde{S}(k) := \frac{1}{n}\left\langle S(k) - S(0) \right\rangle = \frac{1}{n} \left\langle \ln \frac{\int \sqrt{\det \widetilde{g}}\; d\theta^1 \wedge \ldots \wedge d\theta^n}{\int \sqrt{\det g}\; d\theta^1 \wedge \ldots \wedge d\theta^n} \right\rangle, \tag{16}$$

repeated for different values of k; here $\langle\cdot\rangle$ stands for the above-mentioned average
over the different realisations of the adjacency matrix A, and g is the metric
corresponding to the null adjacency matrix.
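The computational scheme just described can be sketched in a few dozen lines. The version below is a simplified illustration, not the procedure used for the figures: it draws independent uniform points in a hypercube instead of generating a Markov chain, uses a very small n, and the hypercube bounds `box`, the function names and all default values are assumptions made for the example. Because $\sqrt{\det \widetilde{g}}$ is heavy-tailed near degenerate points, estimates from such small runs are noisy.

```python
import math
import random

def gauss_jordan(matrix):
    """Return (determinant, inverse); the inverse is None for a singular matrix."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            return 0.0, None
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return det, [row[n:] for row in aug]

def sqrt_abs_det_metric(theta, adjacency):
    """sqrt|det g~| with g~ = (1/2)(psi^{-1})^2 entrywise, psi = diag(theta) + A."""
    n = len(theta)
    psi = [[adjacency[i][j] + (theta[i] if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    _, inv = gauss_jordan(psi)
    if inv is None:
        return None  # degenerate point, outside the manifold support
    g = [[0.5 * inv[i][j] * inv[i][j] for j in range(n)] for i in range(n)]
    det_g, _ = gauss_jordan(g)
    value = math.sqrt(abs(det_g))
    return value if math.isfinite(value) else None  # drop overflowing points

def random_adjacency(n, k, weight, rng):
    """Symmetric matrix with k randomly chosen links (i < j) of equal weight."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    adjacency = [[0.0] * n for _ in range(n)]
    for i, j in rng.sample(pairs, k):
        adjacency[i][j] = adjacency[j][i] = weight
    return adjacency

def entropy_estimate(n, k, weight=0.2, samples=300, realisations=3,
                     box=(0.5, 2.0), seed=0):
    """Estimate (1/n) <ln(V(A) / V(0))> over random realisations of A."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(realisations):
        adjacency = random_adjacency(n, k, weight, rng)
        num = den = 0.0
        for _ in range(samples):
            theta = [rng.uniform(*box) for _ in range(n)]
            v = sqrt_abs_det_metric(theta, adjacency)
            if v is None:
                continue
            num += v
            # Bare network: sqrt(det g) = prod_i 1 / (sqrt(2) * theta_i).
            den += math.prod(1.0 / (math.sqrt(2.0) * t) for t in theta)
        total += math.log(num / den)
    return total / (realisations * n)

n = 6
for k in (0, 3, 6, 12):
    print(f"k/n = {k / n:.2f}  entropy estimate = {entropy_estimate(n, k):.4f}")
```

Since numerator and denominator are averaged over the same sample points, the hypercube volume cancels in the ratio and the estimate vanishes for k = 0, as required by the definition (16).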
Figure 1: (Color online) Geometric entropy $\widetilde{S}$ of G(25, k) (magenta points), G(50, k)
(black points), G(100, k) (red points) and G(200, k) (blue points) networks as a
function of the number k of randomly chosen links of weights equal to r = 0.2.
The black solid line is a guide to the eye coming from a linear fitting of a linear-logarithmic
presentation of the data.

Figure 2: (Color online) Geometric entropy $\widetilde{S}$ of G(50, k) networks as a function of
the number k of randomly chosen links of weights equal to r = 0.1 (green points),
r = 0.2 (red points) and r = 0.4 (black points).
For all the four cases reported in Figure 1 we have equal weights $A_{ij} = r$ for all
the k non-vanishing links.


The reason for displaying $\widetilde{S}$ of Eq. (16) instead of S in (12), and versus k/n
instead of k, is that one obtains what in the context of statistical mechanics is called
a collapse plot of the results obtained at different n-values. The corresponding
points crowd on a common pattern for large k, whereas for k/n ranging from 0 to
approximately 1 the patterns obtained show a phenomenon which is familiar in the
context of numerical investigations of second-order phase transitions: as in the case
of finite-size effects observed for the order parameter, what asymptotically would
be a sharp bifurcation is rounded at finite n [1]; however, the larger n, the more
pronounced the ‘knee’ of $\widetilde{S}(k/n)$ in the range (0, 1). This is clearly in excellent
agreement with an n-asymptotic bifurcation at k/n = 0.5 (marked by the solid line),
where the Erdős-Rényi phase transition takes place.


In Figure 2 we report the outcomes for G(50, k) having set all the non-vanishing
entries $A_{ij}$ of the adjacency matrix again equal to a constant value r. For r = 0.1
a considerable softening of the shape of $\widetilde{S}(k/n)$ is observed; this is of course an
expected result since for r → 0 the transition must disappear. For r = 0.2 and
r = 0.4 the shapes of $\widetilde{S}(k/n)$ look almost the same, the only interesting difference
being a slightly more pronounced ‘knee’ in the r = 0.4 case, which, going in the
opposite direction, is consistent with the previous observation.


Another interesting property of this entropy consists in its versatility, that is, it 
can be easily adapted to more refined descriptions of networks/graphs. 


For example, we can consider a refined modeling of graphs where the entries
of the n × n adjacency matrix A are given by terms of the form $r_{ij}\theta^i\theta^j$. Here the
$r_{ij}$'s (i, j = 1, ..., n) are the weights of the links between the nodes of the network
described by A. Furthermore, the $\theta^i$'s (i = 1, ..., n) are local coordinates on the
manifold $\widetilde{\mathcal{M}}$ of Eqs. (10) and (11), representing the variances of the random variables
on the nodes of the network.


This kind of model has an interesting property: the closer a given variable $\theta^i$ is to
zero, the weaker the weights of all the links associated to the i-th node. In such a
way, this second model allows one to describe a more general class of networks. In
fact, consider for example the flow of some quantity across a network: the vanishing
of the flow on a given node i implies that all the other nodes connected to it become
effectively independent of it. In view of this argument, we test the entropy S
in Eq. (12) against the more general model just described. In Figure 3
S is shown versus k/n, where n = 50 is the dimension of the manifold associated
to the network, and k = 0, ..., n(n − 1)/2 is the number of non-vanishing $r_{ij}$ (all
of them are chosen equal to 1), for i, j = 1, ..., n and i < j. Also in this case our
entropy detects the phase transition predicted by the Erdős-Rényi theorem occurring
at k/n = 0.5. The pattern of S(k/n) reported in Figure 3 shows a more pronounced
"knee" at the asymptotic transition value k/n = 0.5 with respect to what is found
for the same n value and reported in Figures 1 and 2.
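The decoupling property of the refined model can be illustrated by building the matrix $C_0(\theta) + A$ with entries $A_{ij} = r_{ij}\theta^i\theta^j$. This is a small sketch; the function name `psi_refined` and the numerical values are illustrative assumptions.

```python
def psi_refined(theta, weights):
    """Matrix diag(theta) + A with off-diagonal entries r_ij * theta_i * theta_j."""
    n = len(theta)
    return [[theta[i] if i == j else weights[i][j] * theta[i] * theta[j]
             for j in range(n)] for i in range(n)]

# Three fully linked nodes with unit weights; node 0 has a variance close to zero.
weights = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
theta = [1e-3, 1.0, 2.0]
for row in psi_refined(theta, weights):
    print(row)
```

The entries coupling node 0 to the other nodes scale with its variance, so they become negligible compared with the link between nodes 1 and 2 as that variance approaches zero.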
Figure 3: (Color online) Geometric entropy of G(50, k) networks as a function of
the number k of randomly chosen non-vanishing adjacency matrix elements $A_{ij}$ of
the form $r_{ij}\theta^i\theta^j$ with $r_{ij} = 0.2$.


5 Conclusion 

Summarizing, the present work puts forward a novel entropic functional useful to
characterize probabilistic graph models. It is inspired by Statistical Mechanics; however,
instead of being modeled on the Boltzmann entropy, it is modeled on
the microcanonical ensemble definition of entropy, the phase space volume being
replaced by the volume of a ‘state manifold’ (that is, a Riemannian manifold whose
points correspond to all the possible states of a given network). The state manifold
is defined through a suitable metric which is borrowed from an information
geometry framework. The result is a constructive way of associating a differentiable
and handy mathematical object to any simple undirected and weighted graph or
network.

Notice that a similar way of associating a probability distribution to a network
is that of probabilistic graph models [13]. Here the choice of Gaussian probability
distributions is motivated by the fact that Gaussian networks are extensively used in
many applications, ranging from neural networks to wireless communication, from
proteins to electronic circuits, and so on.

The most relevant property of the proposed entropic-geometric measure is its
ability to detect the phase transition which is rigorously predicted by the Erdős-Rényi
theorem for random graphs: a paradigmatic example of an analytically known
emergent phenomenon occurring in a network. This effect shows up very clearly; in
fact, the geometric entropy proposed here displays both the pattern (as a function of
a control parameter) and the finite-size dependence which are typically displayed by
the order parameter of a second-order phase transition in physics. As a natural, though
non-trivial, extension, our entropic-geometric measure could be applied to pseudo-random
graphs and dense graphs, where the emergence of the giant component has
been recently proved.

Finally, the differential-geometric framework proposed opens some fascinating
perspectives of application to the study of complex networks. As a matter of
fact, the introduced geometric entropic measure could account both for the structural
complexity of a given network and for its statistical complexity, that is, for the
complexity of the probability distributions of the entities constituting the network.